-
Notifications
You must be signed in to change notification settings - Fork 916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unify Copy-On-Write and Spilling #15436
Conversation
def __init__( | ||
self, | ||
ptr: int, | ||
size: int, | ||
owner: object, | ||
exposed: bool, | ||
): | ||
"""Create a new buffer owner. | ||
|
||
Do not use this directly, instead use `_from_device_memory` or | ||
`_from_host_memory`. | ||
|
||
Raises | ||
------ | ||
ValueError | ||
If size is negative | ||
""" | ||
if size < 0: | ||
raise ValueError("size cannot be negative") | ||
|
||
self._ptr = ptr | ||
self._size = size | ||
self._owner = owner | ||
self._exposed = exposed | ||
self._slices = weakref.WeakSet() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice, I have removed the direct-use-of-Buffer guard. I think the four mandatory arguments are enough to prevent accidental use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small suggestions but overall this looks good to me.
Co-authored-by: Bradley Dice <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm excited to have this happen! Thanks for pushing through Mads.
We should also not forget about these two bits of work proposed from the last round:
- Refactoring of Buffers (last step towards unifying COW and Spilling) #13801 (comment)
- Refactoring of Buffers (last step towards unifying COW and Spilling) #13801 (comment)
I also wonder if there is a sensible way to unify the SpillableBufferOwner with the BufferOwner (current names, which will be changed in the future). Not in scope for this PR, but some of my prior concerns about the Liskov substitutability of different Buffer types still apply there (also CC @wence-).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor
@property | ||
def exposed(self) -> bool: | ||
return self._owner.exposed | ||
self.owner._slices.add(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: should we assert isinstance(owner, BufferOwner)
here? Or is it not necessary because we implicitly require it via the _slices
attribute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess, we could add assert isinstance(owner, BufferOwner)
in Buffer.__init__
but I am not sure if that is pythonic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An assertion would basically be there to protect against developer error. There's nothing unpythonic about it, that's definitely what the functionality is there for. The main question I would have is, how would we end up with something other than a BufferOwner here, and how bad would the failure mode be? Unless the assertion makes it markedly easier to debug the code it doesn't add a whole lot of value IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is arguable a little unpythonic to prevent duck-typing by using isinstance
to force a specify type (as opposed to a general ABC/protocol type).
Anyways, in this case as_buffer()
is the only one creating ExposureTrackedBuffer
and it explicitly sets the owner
to SpillableBufferOwner | SpillableBuffer | BufferOwner
so I don't think a type error here is likely.
Co-authored-by: Lawrence Mitchell <[email protected]>
Regarding renaming:
I think we should do it in a follow up PR, it will touch a lot of files throughout cudf! Regarding #13801 (comment):
I agree, if dask or ucx should handle this, we need to publish a buffer API that includes this |
Happy to address one or both of those in a follow-up, especially since the second one apparently requires further discussion (I haven't given it any more thought yet). For the first, we should try to turn that around quickly to minimize conflicts since as you say it will touch a lot of files. Once you have that PR ready (after this one merges) feel free to ping me aggressively for reviews. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with this when Lawrence is, and when Prem is satisfied with the c-o-w testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to gooo!!
/merge |
Description
This is the final step to unify COW and spilling. Now,
SpillableBuffer
inherits fromExposureTrackedBuffer
so the final class hierarchy becomes:Additionally, spill-on-demand is now set globally using
set_spill_on_demand_globally()
instead of in theSpillManager
constructor.Checklist